Stochastic Dual Coordinate Descent with Alternating Direction Multiplier Method

Author

  • Taiji Suzuki
Abstract

A. Derivation of the proximal operation for the smoothed hinge loss

By the definition of the smoothed hinge loss, we have that, for $-1 \le y_i v \le 0$,

$$f_i^*(v) = \sup_{u \in \mathbb{R}} \{ uv - f_i(u) \} = \sup_{u \in \mathbb{R}} \left\{ uv - \tfrac{1}{2}(1 - y_i u)^2 \right\} = y_i v + \tfrac{v^2}{2},$$

where the last equality follows because the maximizer satisfies $y_i u = 1 + y_i v$, which lies in the quadratic region of the loss whenever $-1 \le y_i v \le 0$.
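As a quick sanity check on the closed form above (not from the paper; the grid-search setup and names are assumptions for illustration), the following sketch approximates the supremum numerically and compares it with $y_i v + v^2/2$ over the stated domain, using smoothing parameter $\gamma = 1$:

```python
import numpy as np

def smoothed_hinge(u, y):
    """Smoothed hinge loss with smoothing parameter gamma = 1."""
    m = y * u  # margin
    return np.where(m >= 1.0, 0.0,
                    np.where(m <= 0.0, 0.5 - m, 0.5 * (1.0 - m) ** 2))

# Approximate f_i^*(v) = sup_u {u*v - f_i(u)} by a dense grid search over u
# and compare with the closed form y_i*v + v^2/2 on the domain -1 <= y_i*v <= 0.
us = np.linspace(-20.0, 20.0, 400_001)
for y in (1.0, -1.0):
    for s in np.linspace(-1.0, 0.0, 11):   # s = y*v sweeps the domain [-1, 0]
        v = y * s
        numeric = np.max(us * v - smoothed_hinge(us, y))
        closed = y * v + 0.5 * v ** 2
        assert abs(numeric - closed) < 1e-6, (y, v, numeric, closed)
print("grid-search supremum matches y*v + v^2/2 on -1 <= y*v <= 0")
```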


Similar articles

Stochastic Dual Coordinate Ascent with Alternating Direction Method of Multipliers

We propose a new stochastic dual coordinate ascent technique that can be applied to a wide range of regularized learning problems. Our method is based on the Alternating Direction Method of Multipliers (ADMM) to deal with complex regularization functions such as structured regularizations. It naturally affords mini-batch updates, which speed up convergence. We show that, under mi...
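For context, here is a minimal sketch of the plain SDCA building block that such a method extends, using the smoothed-hinge conjugate derived above with a simple l2 regularizer. There is no ADMM splitting, mini-batching, or structured regularizer here; the setup and names are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def sdca_smoothed_hinge(X, y, lam, epochs=20, gamma=1.0, seed=0):
    """Vanilla SDCA for (1/n) sum_i f_i(x_i^T w) + (lam/2)||w||^2
    with the smoothed hinge loss (gamma = 1 by default)."""
    n, d = X.shape
    alpha = np.zeros(n)                      # dual variables, y_i*alpha_i in [0, 1]
    w = np.zeros(d)                          # w = (1/(lam*n)) * sum_i alpha_i * x_i
    sq_norms = (X ** 2).sum(axis=1)
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Closed-form maximization of the dual objective in alpha_i
            # (smooth-hinge update of Shalev-Shwartz & Zhang, 2013).
            margin = y[i] * (X[i] @ w)
            step = (1.0 - margin - gamma * alpha[i] * y[i]) \
                   / (sq_norms[i] / (lam * n) + gamma)
            new_ai_y = np.clip(alpha[i] * y[i] + step, 0.0, 1.0)
            delta = y[i] * new_ai_y - alpha[i]
            alpha[i] += delta
            w += delta * X[i] / (lam * n)    # keep primal w in sync with alpha
    return w

# Toy usage on a nearly separable problem
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))
w = sdca_smoothed_hinge(X, y, lam=0.01)
print("train accuracy:", np.mean(np.sign(X @ w) == y))
```

Roughly speaking, ADMM-based variants replace this simple l2 primal-dual link with an augmented-Lagrangian step so that structured regularizers can be handled.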


Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent

We present and study a distributed optimization algorithm employing a stochastic dual coordinate ascent method. Stochastic dual coordinate ascent methods enjoy strong theoretical guarantees and often perform better than stochastic gradient descent methods in optimizing regularized loss minimization problems. However, little effort has been made to study them in a distributed framework. We ...


Dual Averaging and Proximal Gradient Descent for Online Alternating Direction Multiplier Method

We develop new stochastic optimization methods that are applicable to a wide range of structured regularizations. Our methods combine basic stochastic optimization techniques with the Alternating Direction Multiplier Method (ADMM). ADMM is a general framework for optimizing a composite function and has a wide range of applications. We propose two types of online variants of AD...
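As a concrete (batch, offline) instance of the ADMM framework referred to here, the sketch below solves the classic lasso split min_x (1/2)||Ax - b||^2 + lam*||z||_1 subject to x = z; the problem choice, parameters, and names are assumptions for illustration, not the proposed online variants:

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=200):
    """Batch ADMM for  min_x (1/2)||Ax - b||^2 + lam*||z||_1  s.t. x = z."""
    n, d = A.shape
    x, z, u = np.zeros(d), np.zeros(d), np.zeros(d)   # u is the scaled dual
    # Factor once: each x-update solves (A^T A + rho*I) x = A^T b + rho*(z - u)
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(d))
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-update: proximal operator of lam*||.||_1, i.e. soft-thresholding
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)
        u += x - z                                    # dual update on x = z
    return z

# Toy usage: approximately recover a sparse signal (with some l1 shrinkage)
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 30))
x_true = np.zeros(30); x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.normal(size=100)
print(np.round(admm_lasso(A, b, lam=1.0), 2)[:6])
```

Online variants of this scheme, as the abstract suggests, replace the exact x-minimization with cheap stochastic steps (e.g., proximal-gradient or dual-averaging updates) computed from streaming data.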


Efficient Distributed Linear Classification Algorithms via the Alternating Direction Method of Multipliers

Linear classification has demonstrated success in many application areas. Modern algorithms for linear classification can train reasonably good models while going through the data in only tens of rounds. However, large data often does not fit in the memory of a single machine, which makes disk I/O, not the CPU, the bottleneck in large-scale learning. Following this observation, Yu et al....


CIS400/401 Final Project Report

The rise of ‘big data’ and large-scale machine learning has created an increasing need for distributed optimization. Most of the current literature has focused on coordinate descent, a prominent distributed optimization technique, due to its simplicity and effectiveness. We focus on implementing two other optimization techniques: distributed dual descent and the alternating ...



Publication date: 2014